Okay, welcome. In this video nugget, I would like to cover the background on world models, the architecture behind the contingent planning algorithms we have seen before. We have been quite hand-wavy about what the states are, and I would like to go into that in a bit more detail now. The problem, remember, is that we do not know with certainty what state the world is in. In the erratic vacuum cleaner example, we did not know whether there was dirt in the other room, and in the slippery one, we did not know where the robot was. So we were uncertain about the state of the world, and that is of course a problem, because we want to know which actions we can do and what their effects are. Therefore, we have to augment our model of what a world state can be. The idea I would like to convince you of today is that we want to track all the states the world could possibly be in. We are going to call that a belief state: it is not the world state, but the belief state of the agent, and it can contain multiple world states if the environment is nondeterministic or only partially observable. The idea in this agent architecture is that instead of keeping track of a single world state, we keep track of the set of possible world states. We call that set one belief state, and we need a transition model, no longer over world states but over belief states, which updates the belief state based on the actions we take and the sensory information we receive.
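To make this concrete, here is a minimal Python sketch of belief-state tracking (all names and the toy two-square vacuum world are my own assumptions, not from the lecture): a belief state is a set of possible world states, the nondeterministic outcomes of an action are given by a results function returning a set of successor states, prediction takes the union over the belief state, and an observation filters out inconsistent states.

```python
# Minimal belief-state tracking sketch (illustrative; names are
# assumptions, not from the lecture). A belief state is a set of
# possible world states; `results` maps (state, action) to the SET
# of states the action might produce, i.e. a transition relation.
# In a deterministic world every such set is a singleton, and the
# belief state then stays a singleton too.

def predict(belief, action, results):
    """Belief state after doing `action`: union over all possibilities."""
    return {s2 for s in belief for s2 in results(s, action)}

def update(belief, percept, consistent):
    """Keep only the states that agree with the observed percept."""
    return {s for s in belief if consistent(s, percept)}

# Toy two-square vacuum world: state = (robot_location, dirt_L, dirt_R).
def erratic_results(state, action):
    loc, dl, dr = state
    if action != "Suck":
        return {state}
    if loc == "L":
        # Erratic vacuum: sucking may clean this square only,
        # or overachieve and clean the other square as well.
        return {("L", False, dr), ("L", False, False)}
    return {("R", dl, False), ("R", False, False)}

belief = {("L", True, True)}                      # we know the full state
belief = predict(belief, "Suck", erratic_results)
print(sorted(belief))  # [('L', False, False), ('L', False, True)]
```

Here `predict` handles nondeterministic actions and `update` handles partial observability; chaining them, one predict and one update per step, is exactly the belief-state transition model described above.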
So the environment still determines what the world state can be, but it no longer pins it down to a single state. In fully observable, deterministic environments, we are back where we were before: we can observe the initial state, and the subsequent states are given by the actions. Since I know the initial state and the actions are deterministic, I can predict all subsequent world states from that. The belief state is thus a singleton set, and we still call its single member the world state. The transition model is simply a function from states and actions to states; we call that the transition function, which is exactly what we had before. So, thinking about this, let us look back
at what we did. We first looked at search-based agents: there we assumed a fully observable, deterministic environment, the world state is the current state, and since we had atomic states, there was no inference. In the CSP-based agents, we again had a fully observable, deterministic environment, but the world state was a constraint network. So rather than single states, we had constraint networks: even though we are in a situation with a single world state, we used a representation language, the constraint network, to represent multiple states. That was not because the environment told us so, but because we wanted to do inference, that is, macro steps: reasoning from sets of states to sets of states. The inference there was constraint propagation. The same holds essentially for the logic-based agents: we still have a fully observable, deterministic environment, the world state is given as a logical formula, and inference was something like DPLL, resolution, or tableau. And finally, we had planning agents, still in a fully observable, deterministic environment; the world state is essentially a propositional logic formula, but the transition model is the one given by STRIPS: preconditions, add lists, and delete lists, which tell us how the world changes in planning.
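The STRIPS transition model can be sketched in a few lines, assuming the usual set-based reading: a state is a set of ground atoms, and an action is applicable when its preconditions hold, in which case the delete list is removed and the add list is inserted (the vacuum-world atoms below are my own illustration, not from the lecture):

```python
# Schematic STRIPS semantics (illustrative atoms, not from the lecture):
# a state is a set of ground atoms; an action has preconditions,
# an add list, and a delete list.

def applicable(state, action):
    return action["pre"] <= state                  # all preconditions hold

def apply_action(state, action):
    assert applicable(state, action)
    return (state - action["del"]) | action["add"]

suck_left = {
    "pre": {"At(L)", "Dirty(L)"},
    "add": {"Clean(L)"},
    "del": {"Dirty(L)"},
}

state = {"At(L)", "Dirty(L)", "Dirty(R)"}
state = apply_action(state, suck_left)
print(sorted(state))  # ['At(L)', 'Clean(L)', 'Dirty(R)']
```

Note that this deterministic `apply_action` returns exactly one successor state per action; the generalization discussed next is to let an action map a state to a set of possible successors.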
And the inference, of course, was state-space or plan-space search. Is this all we could do?
And the interesting thing is that if we drop one of these assumptions, if the environment is no longer fully observable, or if it is stochastic, which is exactly what the erratic and slippery vacuum cleaners gave us in the last video nugget, then we have sets of possible states. We have to generalize the transition function to a transition relation, and that is something we are going to look at now. And of course, this even applies to online problem solving. Remember, we had this distinction between offline problem solving, where you basically pre-compute the whole plan or strategy, and the agent then just takes the plan or strategy on board and executes it, and online problem solving. And even in online problem solving,
Accessible via: Open Access
Duration: 00:10:41 min
Recording date: 2021-01-31
Uploaded on: 2021-01-31 19:09:07
Language: en-US